[...] thereby allowing trials examining [...] and/or [...] increased [...] subpopulations as an assessment of the test [...]
METHODS TO REDUCE VARIANCE IN TREATMENT STUDIES
USING GENOTYPING



This application claims the benefit of U.S. Provisional Application No. 60/110,668, filed December 2, 1998, which is incorporated by reference for all purposes.
The present invention resides in the fields of medicine, genetics and statistics.

The conduct and design of studies for investigating treatment efficacy, such as clinical trials, aims to eliminate [...] otherwise. One approach for reducing bias is to randomize individuals to [...] control groups with the view that if the individuals [...] genetically and live independent of [...] environmental [...], on [...] the trial will be balanced with the two [...] of random [...] condition measured is greater than if each case [...] environmental influences.
[...] restriction fragment length polymorphism (RFLP) [...] sequence that alters the length of [...] Hum. Genet. 32:314-331 (1980)). Short tandem repeats (STRs), as the name implies, are short tandem repeats that consist of tandem di-, tri- and tetra-nucleotide repeat motifs. Such polymorphisms are also sometimes referred to as variable number tandem repeat (VNTR) polymorphisms (see, e.g., U.S. Patent No. 5,075,217; Armour et al., FEBS Lett. 307:113-115 (1992); and Horn et al., WO 91/14003).

[...] most common form of polymorphisms are those involving single nucleotide variations between [...] SNPs. Some SNPs occur in protein [...] potentially [...] a genetic disease [...] defective splicing). [...] result in defective protein expression [...]

Other SNPs have no phenotypic effects.
Certain methods of the invention are designed to provide an assessment [...] populations have been characterized for polymorphic profile and [...] that have a similar polymorphic profile. A determination is then made [...] In some instances, especially where a significant difference is not found, the selecting and determining steps are repeated one or more times. In each additional cycle [...] the polymorphic profile [...] previous cycles. [...]
In another aspect, the invention provides various computer systems and programs. For instance, certain computer products for assessing a treatment procedure are provided. Some systems include program products that generally include code for providing or receiving data, wherein the data includes: (1) designations for each member of a treated population treated according to a treatment procedure and for each member of a control population treated according to a control procedure, (2) designations for a polymorphic profile for each member of the treated and control populations, and (3) designations for a test parameter for each member of the treated and control populations. The program also includes code for selecting a subpopulation from each of the treatment and control populations that have a similar polymorphic profile, code for determining whether there is a statistically significant difference in the test parameter between the subpopulations, and code for displaying an output that indicates whether a statistically significant difference was found between the subpopulations. The code is [...] on a computer readable storage medium.

The invention further provides a computerized system for assessing treatment procedures. Some systems generally include a memory, a system bus and a processor. The processor is operatively disposed to provide or receive data, wherein the data includes: (1) designations for each member of a treated population having been treated according to a treatment procedure and for each member of a control population treated according to a control procedure, (2) designations for a polymorphic profile for each member of the treated and control populations, and (3) designations for a test parameter for each member of the treated and control populations. The processor is further disposed to select a subpopulation from each of the treatment and control populations that have a similar polymorphic profile and determine whether there is a statistically significant difference in the test parameter between the subpopulations. The processor is also capable of displaying an output indicating whether a statistically significant difference was found between the subpopulations.
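The three kinds of data and the select/determine/display steps described above can be pictured with a minimal Python sketch. Everything named here (`Member`, `select_subpopulations`, `significant_difference`, `assess`) is hypothetical and not taken from the specification; "similar" is simplified to exact profile equality, and the significance test is a large-sample z-test chosen only for brevity.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist, mean, variance


@dataclass(frozen=True)
class Member:
    """One study member; all field names are illustrative assumptions."""
    member_id: str
    treated: bool          # (1) treated vs. control designation
    profile: tuple         # (2) polymorphic profile, e.g. one genotype per site
    test_parameter: float  # (3) measured test parameter


def select_subpopulations(members, reference_profile):
    """Keep members whose profile equals the reference (simplest notion of 'similar')."""
    treated = [m for m in members if m.treated and m.profile == reference_profile]
    control = [m for m in members if not m.treated and m.profile == reference_profile]
    return treated, control


def significant_difference(treated, control, alpha=0.05):
    """Two-sided z-test on the difference in means.

    Needs at least two members per group and a non-zero pooled standard error.
    """
    x = [m.test_parameter for m in treated]
    y = [m.test_parameter for m in control]
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = (mean(x) - mean(y)) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha, p_value


def assess(members, reference_profile, alpha=0.05):
    """Select matched subpopulations, test them, and display the outcome."""
    treated, control = select_subpopulations(members, reference_profile)
    found, p_value = significant_difference(treated, control, alpha)
    print(f"statistically significant difference found: {found} (p = {p_value:.4f})")
    return found
```

A real system would likely use a graded similarity measure and a test suited to the sample size; the sketch only fixes the flow of data.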
A "treatment procedure" [...] is a [...] population with a vaccine. [...] performed [...] procedure [...] population can receive a pharmaceutical composition [...] the control [...] a placebo, no pharmaceutical [...] composition relative to the [...]
[...] the form of a case-control study (for a discrete random variable, the groups being affected and unaffected individuals) or a single population study where the cause of the degree or severity of phenotype is being investigated (for example, a quantitative study can examine blood pressure, blood glucose, etc.).

A "treatment study" is an inquiry into the effect or influence a particular treatment procedure has on a biological condition, biological susceptibility or biological resistance of a subject. The study can be quite structured, formal and extensive in scope, or can be relatively unstructured and of limited scope. For example, a treatment study can be a formal clinical trial or study performed on a relatively large group of subjects wherein the study is performed according to set guidelines (e.g., governmental regulations). However, the treatment study can also be a preclinical study, a field trial of a plant population or even an informal study by a scientist, veterinarian or a physician of the effects of a treatment on relatively few subjects. In a treatment study, the subjects are divided into several (though often just two) groups. These may represent different dose ranges or simply the treated and the untreated subjects. In the study, the random variable is measured after treatment. It may also be measured before treatment if it is a change in the variable over time that is being investigated (e.g., bone mineral density or blood pressure). It is preferable that subjects are not undergoing any other treatments for their pathological condition. However, if such a constraint is unreasonable, the study should be designed so that subjects in both treated and untreated groups are undergoing the same alternative treatment. The treatment study can be conducted with subjects of any type of organism, including, for example, animals (including humans), plants, bacteria and viruses.



A "biological condition" refers to the condition, susceptibility or resistance of the organism upon which the study seeks to determine whether the treatment procedure has an effect. Typically, the biological condition is a physical or physiological condition of the organism. For example, in some instances the biological condition is a pathological condition (i.e., a physiological state that normally does not exist, such as a disease for example). Pathological conditions typically studied with the methods of the invention are those with a minimal environmental variance (e.g., high cholesterol levels in
This notion can extend to [...] cysts in the liver. [...] test parameter [...] of the treatment procedure [...] can be evaluated [...] if the biological [...] concentration of HIV in [...] cholesterol concentration.
The term "variance" refers to variation, scatter, spread [...]. Mathematically, variance is the mean value of the squared deviations (Armitage, P., [...]) [...] variance indicates large deviations [...] squared deviation of a [...] relative to the mean. Other statistical measures of [...] dispersion about a [...] test parameter takes the shape of a [...] the invention decreases the variance and thus [...]

Typically, the variance is due to [...] influence the biological condition being [...] environmental and measurement variables. [...]

"Polymorphism" refers to the occurrence of two or more genetically determined alternative sequences [...] in a population [...] genetic divergence [...]. A polymorphic locus can be as small as one base pair. Such a [...] referred to as a single nucleotide polymorphism or [...]
[...] repeats, and insertion elements such as Alu. The first identified allelic form is arbitrarily designated as the reference form and other allelic forms are designated as alternative or variant alleles. The allelic form occurring most frequently in a selected population is sometimes referred to as the wildtype form or allele and the other forms referred to as mutant forms or alleles. Diploid organisms can be homozygous or heterozygous for allelic forms. A diallelic polymorphism has two forms. A triallelic polymorphism has three forms.

A "single nucleotide polymorphism" occurs at a polymorphic site that is occupied by a single nucleotide, which is the site of variation between allelic sequences. The site is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations).

A single nucleotide polymorphism (SNP) usually arises due to substitution of one nucleotide for another at the polymorphic site. A transition is the replacement of one purine by another purine or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine or vice versa. Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele.
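The transition/transversion distinction above is mechanical enough to express in a few lines. The following sketch assumes DNA bases only (A, C, G, T); the function name `classify_substitution` is illustrative and does not come from the specification.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(reference_base, variant_base):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = reference_base.upper(), variant_base.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected DNA bases A, C, G or T")
    if ref == alt:
        raise ValueError("identical bases: no substitution")
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition;
    # a change of chemical class is a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

Deletions and insertions, which the text also counts as single nucleotide polymorphisms, fall outside this classification and would need separate handling.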

A "polymorphic profile" refers to one or more polymorphic forms for which a subject is characterized. A polymorphic form is characterized by identifying which nucleotide(s) is (are) present at a polymorphic site in a nucleic acid sample acquired from a subject. The profile includes at least one polymorphic form and preferably includes a plurality of polymorphic forms, such as at least 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 polymorphic forms or more. Polymorphic profiles are similar when the polymorphic profiles being compared share at least one polymorphic form at at least one polymorphic site. Typically, similar polymorphic profiles share identity of polymorphic forms in at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% in at least 10, 20, 30, 40, 50, 60, 70, 100, or 500 polymorphic sites. Polymorphic forms are identical if the nucleotide(s) at a particular polymorphic site are the same. Thus, two polymorphic profiles each including 10 polymorphic forms are 50% identical if five of the polymorphic forms in the two profiles are identical. If an organism is diploid, then the polymorphic forms at each polymorphic site are considered to be identical in two individuals if both individuals have the same two alleles at the polymorphic site. For example, an individual having alleles a1 and a2 at polymorphic site A is considered to have the same profile as an individual having alleles a1 and a2, but not as an individual having alleles a1 and a1, or a2 and a2, or a1 and a3 and so forth.
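The percent-identity rule and the diploid matching rule above can be sketched as follows. The representation is an assumption made for illustration: a profile is a dictionary mapping a site name to a pair of alleles, and `site_identical` and `percent_identity` are hypothetical helper names.

```python
def site_identical(genotype_a, genotype_b):
    """Diploid rule: identical only if both carry the same two alleles (order ignored)."""
    return sorted(genotype_a) == sorted(genotype_b)


def percent_identity(profile_a, profile_b):
    """Percentage of shared polymorphic sites at which the two profiles are identical."""
    shared_sites = profile_a.keys() & profile_b.keys()
    if not shared_sites:
        raise ValueError("profiles share no polymorphic sites")
    matches = sum(site_identical(profile_a[s], profile_b[s]) for s in shared_sites)
    return 100.0 * matches / len(shared_sites)
```

With this rule, (a1, a2) matches (a2, a1) but not (a1, a1), mirroring the example in the text.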

The term "linkage" describes the tendency of genes, alleles, loci or genetic markers to be inherited together as a result of their location on the same chromosome, and can be measured by percent recombination between the two genes, alleles, loci or genetic markers.

"Linkage disequilibrium" or "allelic association" means the preferential association of a particular allele or genetic marker with a specific allele or genetic marker at a nearby chromosomal location more frequently than expected by chance (see, for example, Weir, B., Genetic Data Analysis, Sinauer Associates Inc., 1996). For example, if locus X has alleles a and b, which occur equally frequently, and linked locus Y has alleles c and d, which occur equally frequently, one would expect the combination ac to occur with a frequency of 0.25. If ac occurs more frequently, then alleles a and c are in linkage disequilibrium. Linkage disequilibrium may result from natural selection of certain combinations of alleles or because an allele has been introduced into a population too recently to have reached equilibrium with linked alleles.
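The worked example above (alleles a and c each at frequency 0.5, expected combination frequency 0.25) corresponds to the usual disequilibrium coefficient D, the observed haplotype frequency minus the product of the allele frequencies. The function name below is illustrative.

```python
def ld_coefficient(freq_ac, freq_a, freq_c):
    """Return D = P(ac) - P(a) * P(c); D is zero at linkage equilibrium."""
    for f in (freq_ac, freq_a, freq_c):
        if not 0.0 <= f <= 1.0:
            raise ValueError("frequencies must lie in [0, 1]")
    return freq_ac - freq_a * freq_c
```

For the example in the text, `ld_coefficient(0.25, 0.5, 0.5)` is zero, while an observed ac frequency above 0.25 gives a positive D, indicating that a and c are in linkage disequilibrium.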

A marker in linkage disequilibrium can be particularly useful in detecting susceptibility to disease (or other phenotype) notwithstanding that the marker does not cause the disease. For example, a marker (X) that is not itself a causative element of a disease, but which is in linkage disequilibrium with a gene (including regulatory sequences) (Y) that is a causative element of a phenotype, can be detected to indicate susceptibility to the disease in circumstances in which the gene Y may not have been identified or may not be readily detectable.
[...] the area of the target DNA [...] primer that hybridizes with the 5' end of [...] directly [...] the stringency of the hybridization [...]
[...] be detected. [...] biochemical, immunochemical, [...] be used to quantitate the amount of bound label.

A "labeled nucleic acid probe" is a nucleic acid probe that is bound, either [...] van der Waals or hydrogen bonds to a label such that the presence of the probe [...]

[...] to a particular nucleotide sequence under stringent [...] about 5 °C lower than the thermal melting point [...] (under defined ionic strength, pH, and [...]
[...] variables [...] analysis. The [...] and a random variable [...] (the P-value) [...] pharmaceutical [...] or viruses. [...] skin or buccal scrapings. [...] human. Tissues [...]

The term "patient" refers to both human and veterinary subjects.
General

The present invention provides methods, computer programs, and [...] useful for designing treatment studies and for evaluating the efficacy of [...] procedures (e.g., clinical trials) [...] designed to control for underlying [...] genotype of a subject [...] to treatment. The present invention is [...] genetic [...] variability of individuals sharing the same alleles at genes involved in [...] polymorphisms [...] differences in response to the treatment [...] influencing that phenotype. In the context of [...] response to a treatment [...]
[...] principle underlying the methods of the invention can [...] test parameter is measured in two groups, the first (which is of size n) is [...] (of size m) is untreated. [...] variance of [...] be calculated in the standard way (see Armitage, Berry, Statistical Methods in Medical Research, Blackwell Science, 1995). Thus, for instance, in a simple case the mean and variance of the treated group are $\bar{x}_1$ and $s_1^2$ respectively, and the mean and variance of the untreated group are $\bar{x}_2$ and $s_2^2$ respectively. Then an approximate confidence interval at α% (where, for example, α = 0.95) for the difference in response between the two groups is given as,

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha} \sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}}$$
where $z_{\alpha}$ is a value of the standard normal distribution that is exceeded by chance in [...]

Hence any procedure that decreases the variance in either sample (i.e., which decreases $s_1^2$ or $s_2^2$) necessarily decreases the size of the confidence interval. [...] variance of one or [...] the confidence interval can be held constant with fewer patients enrolled in the trial (i.e., n and/or m can be reduced). Thus, reducing the variance in response can lead either to greater certainty of a difference [...] or [...] a reduced sample size for the same statistical power. The variance can be reduced in a number of different ways as described in the following sections.
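The trade-off described above, where a lower variance buys either a narrower confidence interval or a smaller sample at the same interval width, can be checked numerically. This is a minimal sketch under a normal approximation; `ci_difference` and its parameter names are illustrative, and `level` is read as a two-sided confidence level (0.95 gives z of about 1.96).

```python
from math import sqrt
from statistics import NormalDist


def ci_difference(mean1, var1, n, mean2, var2, m, level=0.95):
    """Approximate two-sided confidence interval for the difference in group means."""
    z = NormalDist().inv_cdf(0.5 + level / 2)     # standard normal critical value
    half_width = z * sqrt(var1 / n + var2 / m)    # z * sqrt(s1^2/n + s2^2/m)
    difference = mean1 - mean2
    return difference - half_width, difference + half_width


# Halving both variances lets both group sizes be halved at the same interval width.
wide = ci_difference(10.0, 4.0, 100, 8.0, 4.0, 100)
matched = ci_difference(10.0, 2.0, 50, 8.0, 2.0, 50)
```

Here `wide` and `matched` span the same interval even though the second design enrolls half as many subjects per group, which is the point the paragraph makes about holding the confidence interval constant.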

[...] confounding factors by increasing the homogeneity of the population. [...] set of polymorphic markers can be examined in a large group of subjects [...] those with similar polymorphic profiles enrolled in the treatment study. [...] genetic factors (represented by the polymorphic profile) into the inclusion [...] of a [...] study allows an experimenter to reduce the variance in response due to underlying genetic factors.

B. [...] genetically homogeneous subsets.

A second approach is to categorize individuals into subsets depending on how similar [...] polymorphic profiles are to one another. Within each subset, subjects are randomly
[...] procedure (e.g., administration of a drug) being tested is unlikely to be effective in any significant portion of the population, and that further research is not justified. If, however, statistical significance is reached for a particular polymorphic DNA profile, at least two conclusions follow. First, in the case of a clinical trial on a drug, the drug is effective in at least a portion of the population, and further development of the drug may well be justified. Second, one knows the portion of the general population in which the drug is effective, this portion being defined by a polymorphic profile. This profile can be used as a diagnostic to identify patients appropriate for treatment when the decision to treat or a choice of treatments is made.

As an example of a method of the invention, a clinical trial can be carried out as follows:

1. Identification and choice of polymorphisms.
A set of polymorphisms is identified that allows the division of the patient cohort into sub-groups. These polymorphisms may be known to be involved in the test parameter (e.g., the phenotype or endpoint) that is to be measured or can be chosen at random. (In the latter case, the genetic sub-groups may show identical results with respect to the phenotype of interest. This implies the method of grouping does not decrease the variance in the endpoint and the population can be re-analyzed as a whole. Thus, stratification by using genetic data does not have a deleterious effect on the experiment or trial, even in cases where it does not influence the outcome.)

2. Genotyping of the cohort.
Some or all of the markers are genotyped in the entire cohort of patients enrolled in the clinical trial. These data are then used either as inclusion/exclusion criteria (see 3a below) or to divide the cohort into subgroups (see 3b below).

3a. Inclusion/exclusion of patients using genetic information.
If some or all of the polymorphisms are known to influence the test
[...] or toxic response to a treatment, and may identify, by virtue of unresponsiveness, a clinical subset of patients that define a "different" disease. In short, a post facto genetic analysis correlated with a specific clinical phenotype such as drug responsiveness or unresponsiveness can reveal different etiologic mechanisms for the disease being treated. This is especially likely in the case of ethnic differences among patients where each ethnic group has a distinctive response to a treatment. Finally, analysis of phenotypic markers can provide insight into genetic diversity of the subjects being treated, allowing the clinician to alter enrollment in a drug trial to accommodate more or less genetic diversity as is scientifically prudent.

III. Methods

A. General

In the methods of the invention, members of a treated and control (untreated) population having a biological condition of interest (e.g., a disease) are characterized for polymorphic profile and a test parameter that is a measure of the biological condition, assuming the members have not already been so characterized. The members in the treated population have been (or are) treated according to a treatment procedure, whereas the members of the control population have been (or are) treated according to a control procedure.

To reduce total variance in the treatment assessment or study, subpopulations from the treated and control populations are selected for similar genetic composition such that the members in the two populations have similar or identical polymorphic profiles. The polymorphic profile of the subpopulations includes one or more polymorphic forms. Typically, the polymorphic profile includes a plurality of polymorphic forms, generally at least 5, in other instances at least 10, and in still other instances at least 100, or any number therebetween.

To minimize genetic variance between the treated and control subpopulations, the polymorphic profiles for the two groups are selected to be similar.
[...] responsiveness may be more important (and hence given more weight) than random polymorphisms.

The polymorphisms can be in genomic DNA, RNA or cDNA. While any polymorphisms can be used, those of particular import are polymorphisms in genes that encode proteins that directly or indirectly influence a biochemical pathway that is correlated with the biological condition being measured or observed. Thus, for example, if a study involves assessing the efficacy of methods for treating patients having elevated blood cholesterol levels, the polymorphic profile can be tailored to include polymorphisms located in genes known to be involved in cholesterol synthesis and metabolism.

Once appropriate subpopulations have been selected such that the subpopulations have the desired level of similarity in polymorphic profile, a determination is made whether there is a correlation between the polymorphic profile and the efficacy (or lack thereof) of the treatment method by ascertaining whether there is a statistically significant difference in a test parameter between the treated and control subpopulations, where the test parameter is a measure or is representative of the efficacy of the treatment for the biological condition shared by members of the subpopulation. A finding of a statistically significant difference indicates that the polymorphic forms in the polymorphic profile of the treated subpopulation correlate with the biological condition (e.g., the polymorphic profile is correlated with a particular disease) and that the treatment method under study is useful (or not beneficial) for treating subjects with the biological condition.

As noted above, such correlations are particularly important, for example, in clinical trials on a drug. In some instances, the correlation identifies a set of genetic markers associated with the disease and thus has diagnostic value. In other instances, the correlation identifies markers that are associated with a positive treatment result and thus are important from a therapeutic standpoint.

A statistically significant difference in a test parameter between the treatment and control subpopulations can be determined using standard methods of statistical analysis. Methods include, for example, the analysis of variance, logistic regression, cluster analysis, non-parametric statistics, contingency table test and other standard statistical tests.
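As one concrete instance of the non-parametric statistics mentioned above, a two-sample permutation test on the difference in means can be sketched with the standard library alone. The choice of a permutation test, the function name and the default of 10,000 permutations are all assumptions made for illustration; the specification does not prescribe a particular test.

```python
import random
from statistics import mean


def permutation_test(treated, control, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in means by label shuffling."""
    rng = random.Random(seed)              # fixed seed keeps the estimate reproducible
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    k = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                # random reassignment of group labels
        if abs(mean(pooled[:k]) - mean(pooled[k:])) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

The test makes no distributional assumption about the test parameter, which can be convenient for the small, genetically matched subpopulations the method produces.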

B. Repetition of Method

The polymorphic profile of the subpopulation initially selected often does not correlate with a statistically significant difference in the test parameter that is used to measure the efficacy of treatment. In such instances, the method can be repeated with different subpopulations created by using an alternative definition or measure of genetic similarity, or by dividing the population into greater or fewer sub-populations. This reflects the fact that there will rarely be a single unique way to group patients. Indeed, for a study with N individuals, it will often be possible to form any number of sub-populations from 1 (the entire population) to N (each individual in its own sub-population). Repeating the process is often an effective way of detecting which polymorphisms within the polymorphic profile are particularly informative with respect to the test parameter of interest. Once a correlation is identified, additional cycles can be repeated using, for example, a subset of the polymorphic forms utilized in an earlier cycle to determine whether the subset might show an even greater correspondence with the test parameter and thus treatment efficacy.
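The observation that N individuals can be split into anywhere from 1 to N sub-populations can be illustrated by grouping subjects on a chosen subset of markers: no markers gives one group, and enough markers can separate every individual. The data layout and the name `group_by_markers` are assumptions for this sketch only.

```python
from collections import defaultdict


def group_by_markers(profiles, markers):
    """Group subject ids by their genotypes at the chosen markers.

    profiles: {subject_id: {marker_name: genotype}}
    markers:  the marker names used to define genetic similarity in this cycle
    """
    markers = tuple(markers)
    groups = defaultdict(list)
    for subject_id, profile in profiles.items():
        key = tuple(profile[marker] for marker in markers)
        groups[key].append(subject_id)
    return dict(groups)
```

Each repetition of the method can then call the function with a different marker subset and re-run the significance test within the resulting groups.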

Typically the polymorphic forms within a polymorphic profile evolve over time to account for a greater proportion of the genetic component of the variance. However, these polymorphic forms generally do not contribute equally. Some account for more variance than others; markers that do not correlate with differences in the treatment and control procedures are discarded from the analysis. The set of markers as a collection have value distinct from the individual markers. This collection has enduring value for understanding the genetic contribution to a distinct biological condition of interest. Individual markers can have diagnostic utility, as can the collection.
When the methods of the invention are utilized in clinical trials, typically subjects in the two groups are not undergoing any other treatments for their pathological condition. In other instances, the study is designed so that subjects in both treated and untreated groups are undergoing the same alternative treatment.

D. Treatment and Control Procedures

The types of treatment and control procedures vary according to the biological condition to which the treatment is directed. As noted above, the biological conditions can be any of a number of conditions, such as a pathological condition or simply a biological susceptibility, for example. A variety of different procedures can be performed when the biological condition is a pathological condition. In many instances, the procedures involve administering a pharmaceutical agent, including, for example: 1) administering a pharmaceutical agent to members of the treated population and giving members in the control population a placebo or nothing at all; 2) giving members of the treated population one pharmaceutical agent (or combination of pharmaceutical agents) and a different pharmaceutical agent (or combination of pharmaceutical agents) to the control members; 3) providing one quantity of a pharmaceutical agent to the treated population and a different amount to the control population; or 4) administering a pharmaceutical agent to the treatment and control populations according to different schedules.

Instead of administering a pharmaceutical agent, the treatment procedure can include some type of behavioral therapy. Examples of such therapy include, but are not limited to, a particular diet regime (e.g., low fat, low sodium, high protein, or a restricted calorie diet), a prescribed exercise regime (e.g., exercising for a certain time period a certain number of times a week, performing low-impact exercises, exercising to reach a target heart rate, therapies that work certain muscle groups), meditation, yoga, and stress reduction techniques. Of course, the treatment procedure can include combinations of the foregoing procedures as well. Members in control groups may not undergo therapy at all or may be treated in opposing fashion (or may already be engaged in contrary behaviors). For example, if the treatment group is placed on a low caloric diet, members in the control group can be placed on a high caloric diet or can simply be selected for those whose normal diet already is a high caloric diet and thus is not altered.

The treatment procedure can also be directed towards a biological susceptibility or resistance rather than a pathological condition. Thus, for example, in the case of plants, plants can be treated with various agricultural agents used to affect plant growth or health (e.g., fertilizer or other growth stimulants, herbicides, insecticides, and pH altering agents) to assess the effect of such agents on various susceptibilities or resistances of plants (e.g., susceptibility to frost or freeze damage and resistance to herbicides). In like manner, humans or other organisms can also be treated with various agents, for example vaccines, to determine the effect of the agents on various susceptibilities or resistances.

E. Utility

The reduction in variance achieved by the methods of the invention enables researchers to selectively optimize treatment studies. For example, as the genetic variance decreases, the confidence level of the statistical analysis increases. Thus, with the methods of the invention, researchers can more confidently attribute differences in effects as seen between the treated subjects and the control subjects to the treatment administered, rather than to genetic differences between patients. Furthermore, differences between control and test groups can be appreciated sooner. This allows smaller, less costly studies to be performed that have the same statistical power as much larger studies that do not match for the underlying genetics. Alternatively, a study in which patients are matched for genetic factors will be able to detect a much smaller difference in response between treated and untreated individuals than a study of the same size that ignores genetic factors. This allows for less costly studies, more rapid assessment regarding the feasibility and desirability of additional treatment studies, and ultimately, in the case of clinical trials on pharmaceutical compounds for example, allows for more rapid marketing of the pharmaceuticals.
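The link between variance and study size claimed above can be made concrete with the textbook normal-approximation formula for comparing two means, n per group = 2 (z_{1-α/2} + z_{power})² σ² / δ². The formula, the function name and the default α and power are standard statistical assumptions introduced here for illustration, not values from the specification.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(sigma, delta, alpha=0.05, power=0.8):
    """Subjects needed per group to detect a mean difference `delta`.

    sigma: standard deviation of the test parameter within each group
    delta: smallest difference in means the study should detect
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / delta ** 2)


unmatched = subjects_per_group(sigma=10.0, delta=5.0)  # 63 per group
matched = subjects_per_group(sigma=5.0, delta=5.0)     # 16 per group
```

Under these assumptions, halving the standard deviation (for example by matching on genotype) cuts the required enrollment to roughly a quarter at the same power.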
The methods also enable more efficient treatment studies to be designed. For instance, once polymorphisms that correlate with pathological conditions have been identified, subjects that have the polymorphisms as well as the biological condition can be identified and enrolled in additional studies to analyze the effect that other treatments have on the biological condition of interest. Because subjects that will not respond to the treatment are not enrolled, fewer subjects need to be enrolled. Alternatively, if a set of polymorphisms emerges that, when matched between patients in the control and test arms of a trial, is highly correlative with the biological condition being studied, subsequent trials of the efficacy of a treatment can be conducted with fewer patients regardless of response rate if the biological condition being measured has a genetic component. In addition, when polymorphisms associated with differential response are identified, it may be possible to tailor the dose a specific patient receives to be optimal given their polymorphic profile. This will be particularly important when there are unwanted side effects of the treatment and it is desirable to give the minimum efficacious dose.

Furthermore, as noted above, the treatment methods described herein 
permit the identification of subsets of polymorphic forms that correlate with either a 
favorable response or unresponsiveness to treatment, or an unwanted or toxic response to 
a treatment. Clinical trials on the efficacy of certain pharmaceutical treatments can 
identify individuals that are unresponsive to treatment and, in so doing, can in some 
instances result in the identification of a clinical subset of patients that define a "different" 
disease. Such correlations can also be used as a prognostic and/or diagnostic tool to 
identify subjects having or likely to acquire a disease or to select appropriate treatment 
procedures for a subject based upon the particular genetic composition of the subject. 

Information gained from clinical trials in which patients are genotyped for a set of polymorphic genetic markers can also be used in other stages of drug discovery and development. For example, genes shown to be associated with response via the polymorphic profile of the patients may be amenable to intervention and hence represent potential drug targets. Furthermore, treatments that show low efficiency (e.g., many non-responders) or that have high rates of adverse events can be identified by examining the polymorphism profile of patients in early phase trials. This information






wo 00/33161 



PCT/US99/28582 



. v,t>,Pr to take a treatment forward into large and more 

larger trial 

„G V d=pic« a rcpresenativc con-puttr system 10 for 
^^lOtypicaUy-ncludcsabusU^^ ^^^^^^^^ 

connectedsuchasascatmereoviayo or a network interface 44. 

memory 16 or stored on sto^gem Anoftbed«ic« 

I^.e.ce.operaHo.of.hesys^.^enoU.scnW.n^".-- ^ 
H0.2isa„«.us.ra«o..farcpres».Hveco,npu.ersys«™l 

• „„™Ao(isotthepres«.tinva,tion;ho«ever.FIG.2dep,ots6»t 
.^Morperfornnngd^n^^^^J^^^ 






,<„ragemeans(s«FIQ.l). i„ connection vrith a computer 

,o«gcd=vicecpa».eof ^n„gd.ta2^ drives, magnetic 

^.e..B.amp.-f-7;7"^„^ ^^^.OcanincMeaddition^^ 
.p,soUd..u.en.en.onr».dbn^»-^:^^^^^^__^^^„^^,^,Oti, 

^rfware such as inpu«o».pu. (1/0) „^ 
external devices such aa a scam-er 60. extern 

peripheraldevices. ,^,otoclndesaco,npu.erhavingaPeatiun.® 
I„sdmeh«..nces,systen.lOmcln 

0 WTMDOWS® Version 3.1, WlNUuwia 

^c^ptocessoHAthatmnaVWDOW ^^^^^^^^^^^^^^ 

^OWS9S.ope...^^-;^^^^^^^„^ 
the invention can easily be aoap 

.=pa.lug,.o..hescopeor.hept.acn..nve^ti^^ 

Ha3isa«owcha«o,su.UMs^» 

,.nve„tion..aasesslngatica.««P«-^^ 
eon,ainsapluralityofdes,gnatio„sfore,^ 

ttc^edaccordrngtoatreatmentproccdur ,„„„y. He„ce..he 

^^onhlolo^calcondMonXthedata^^-^^^^^ 

,„ identic eachmen^erofthetwopp ^^^^^^^^^j^, 
.ea.gna«onafor.po.,n.o.h.p™M-^^^^ 

population. Subpopulationa&omthetrea step 104, a 

eLtionstepl02.tsinti,aritylnpo,^on^^«^^^^^ 
,3 dcte»lnationis«adewhethetthe.»a — 

'-"'"^.""t«---l.ea.th.e..,ogicalc.a^^ 
the polymorphic profile Of the subp P ^,„^^t of the result of the 

«''':lt::d"c^!«.«.*-,ysls. .anoptiona, 

— ^^"'''"t :rar>^callysl^«cantdlff«««ein.he.c.^ 
30 decUionals.epl08.iffhcre«notasUt«t, 






for the two subpopulations, the selecting step 102, the determining step 104, and the 
displaying step 106 are repeated using subpopulations that have a polymorphic profile 
that is different from that in earlier cycles. 

Hence, the microprocessor in the computer system of the present invention 
is operatively disposed relative to the system memory, the system bus and the 
input/output so as to perform the foregoing functions. For example, the processor 
provides or receives data that comprises designations for each member of the treated and 
control populations, as well as designations for a polymorphic profile and a test parameter 
for each member of the two populations. The microprocessor is also operatively disposed
to select a subpopulation from each of the treatment and control populations for similarity 
in polymorphic profile, determine whether there is a statistically significant difference in 
the test parameter between the subpopulations and display an output of the result 
obtained. 

The computer program of the invention includes code for providing or 
receiving data comprising the various designations for the identity of the members of the
test and control populations, their polymorphic profiles and test parameter results. The
program also includes code necessary to perform the selecting, determining and 
displaying steps set forth above. 
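
The selecting, determining and displaying steps can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `Member` record, the function names, the exact-match rule for "similar" polymorphic profiles and the large-sample z-test are hypothetical choices made here, not details given in the specification, which leaves the statistical test open.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist, mean, variance


@dataclass
class Member:
    member_id: str         # designation for the member
    group: str             # "treated" or "control"
    profile: tuple         # polymorphic profile, e.g. genotypes at chosen markers
    test_parameter: float  # measured outcome


def select(members, group, profile):
    """Selecting step: members of one population sharing a profile (assumed exact match)."""
    return [m for m in members if m.group == group and m.profile == profile]


def determine(treated, control, alpha=0.05):
    """Determining step: two-sided large-sample z-test on the group means (an assumption)."""
    x = [m.test_parameter for m in treated]
    y = [m.test_parameter for m in control]
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = (mean(x) - mean(y)) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


def analyze(members, profiles, alpha=0.05):
    """Repeat select/determine/display over profiles until one is significant."""
    results = []
    for profile in profiles:
        treated = select(members, "treated", profile)
        control = select(members, "control", profile)
        if len(treated) < 2 or len(control) < 2:
            continue  # too few matched members to estimate a variance
        z, p_value, significant = determine(treated, control, alpha)
        # Displaying step: report the result for this pair of subpopulations.
        print(f"profile={profile} z={z:.2f} p={p_value:.4f} significant={significant}")
        results.append((profile, significant))
        if significant:
            break
    return results


def demo_members():
    """Made-up data: no treatment effect for profile GG, a large one for profile AA."""
    rows = []
    for i, v in enumerate([5.0, 6.0, 7.0, 8.0]):
        rows.append(Member(f"t-gg-{i}", "treated", ("GG",), v))
        rows.append(Member(f"c-gg-{i}", "control", ("GG",), v + 0.5))
    for i, v in enumerate([10.0, 11.0, 12.0, 13.0]):
        rows.append(Member(f"t-aa-{i}", "treated", ("AA",), v))
        rows.append(Member(f"c-aa-{i}", "control", ("AA",), v - 9.0))
    return rows
```

Calling `analyze(demo_members(), [("GG",), ("AA",)])` walks through the profiles in order and stops at the first one for which the difference is significant, mirroring the repeat cycle described for steps 102 to 108.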

V. Methods for Determining Polymorphic Profiles 

A. Preparation of Samples 

Polymorphisms are detected in a target nucleic acid from an individual 
being analyzed. For assay of genomic DNA, virtually any biological sample (other than 
pure red blood cells) is suitable. For example, convenient tissue samples include whole 
blood, semen, saliva, tears, urine, fecal material, sweat, buccal, skin and hair. For assay 
of cDNA or mRNA, the tissue sample must be obtained from an organ in which the target 
nucleic acid is expressed. For example, if the target nucleic acid is a cDNA encoding 
cytochrome P450, the liver is a suitable source. 
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^5 AnRFLPcanbe i„ one embodiment of the invention, 

specific phenotypic trait. 
monomer units are added after amplification to specific nucleotides or to non-amplified
nucleic acids prior to separation on the basis of size (e.g., by capillary electrophoresis).

5. Isozyme Markers 

Other embodiments include identification of isozyme markers and allele-specific
hybridization. Isozymes are a group of enzymes that catalyze the same reaction
but vary in physical properties resulting from differences in amino acid sequence (and
hence nucleic acid sequence). Some isozymes are multimeric enzymes containing
slightly different subunits. Other isozymes are either multimeric or monomeric but have
been cleaved from the proenzyme at different sites in the amino acid sequence. Nucleic
acid variation of isozymes can be determined by hybridizing primers that flank a variable
portion of an isozyme nucleic acid sequence to target nucleic acids contained in a sample
obtained from an organism. The variable region is amplified and sequenced. From the
sequence, the different isozymes are determined and linked to phenotypic characteristics.

6. Amplified Variable Sequences 

Amplified variable sequences of the genome and complementary nucleic
acid probes also can be used as polymorphic markers. The phrase "amplified variable
sequences" refers to amplified sequences of the genome that exhibit high nucleic acid
residue variability between members of the same species. All organisms have variable
genomic sequences and each organism (with the exception of a clone) has a different set
of variable sequences. The presence of a specific variable sequence can be used to predict
phenotypic traits. A variable sequence of DNA can be amplified (e.g., utilizing the
amplification techniques listed above) by template-dependent extension of primers that
hybridize to flanking regions of the DNA obtained from a subject. The amplified
products can then be sequenced.

7. Allele-Specific Primers and Hybridization

An allele-specific primer hybridizes to a site on target DNA overlapping a
polymorphism and only primes amplification of an allelic form to which the primer
exhibits perfect complementarity. This primer is used in conjunction with a second
primer that hybridizes at a distal site. Amplification proceeds from the two primers and
produces a detectable amplified product that can be characterized for the particular allelic
form present in a nucleic acid sample. See, e.g., Gibbs, Nucleic Acid Res. 17:2427-2448
(1989) and WO 93/22456.

8. Single-Strand Conformation Polymorphism Analysis

Alleles of target sequences can be differentiated using single-strand
conformation polymorphism analysis, which identifies base differences by alteration in
electrophoretic migration of single-stranded PCR products (see, e.g., Orita, et al., Proc.
Natl. Acad. Sci. USA 86:2766-2770 (1989)). Typically, amplified PCR products are
denatured (e.g., according to known chemical or thermal methods) to form single-stranded
amplification products that can refold or form secondary structures, depending in
part upon the base sequence of the product. The different electrophoretic mobilities of
single-stranded amplification products can be related to base-sequence differences between
alleles of target sequences.

9. Self-Sustained Sequence Replication

Polymorphisms can also be identified by self-sustained sequence
replication. In this approach, target nucleic acid sequences are amplified (replicated)
exponentially in vitro under isothermal conditions using three enzymatic activities
involved in retroviral replication: (1) reverse transcriptase, (2) RNase H, and (3) a
DNA-dependent RNA polymerase (Guatelli, et al., Proc. Natl. Acad. Sci. USA 87:1874 (1990)).
By mimicking the retroviral strategy of RNA replication by means of cDNA
intermediates, cDNA and RNA copies of the original target are accumulated.

10. Arbitrary Fragment Length Polymorphisms (AFLP)

Arbitrary fragment length polymorphisms (AFLP) can also be used as
polymorphisms (Vos, et al., Nucl. Acids Res. 23:4407 (1995)). The phrase "arbitrary
fragment length polymorphism" refers to selected restriction fragments that are amplified
before or after cleavage by a restriction endonuclease. The amplification step permits
primer and at least one nucleotide (typically labelled) that is complementary to the base
occupying the polymorphic site in one allelic form. If that allelic form is present, then the
primer is extended and becomes labelled. In some methods, biallelic polymorphic sites
are analyzed by including two differentially labelled dideoxynucleotides respectively
complementary to bases occupying the polymorphic site in first and second allelic forms
of the target. Analysis of label present in the extended primer indicates whether one or
both of the allelic forms are present in a target sample.



C. High Throughput Screening 

In some instances, identification of polymorphisms is done by high
throughput screening. In one embodiment, high throughput screening involves providing
a library of polymorphic forms of DNA including RFLPs, AFLPs, isozymes, specific
alleles and variable sequences, including SSR. Such "libraries" are then screened against
genomic DNA from the subjects in the treatment study. Once the polymorphic alleles of
a subject have been identified, a link between the polymorphic DNA and the treatment
effect can be determined through statistical associations.

Such high throughput screening can be performed in many different
formats. For example, for those methods involving hybridization reactions, hybridization
can be performed in a 96-, 324-, or a 1024-well format or in a matrix on a silicon chip. In
a well-based format, a dot blot apparatus is used to deposit samples of fragmented and
denatured genomic DNA on a nylon or nitrocellulose membrane. After cross-linking the
nucleic acid to the membrane, either through exposure to ultra-violet light if nylon
membranes are used or by heat if nitrocellulose is used, the membrane is incubated with a
labeled hybridization probe. The membranes are washed extensively to remove
non-hybridized probes and the presence of the label on the probe is determined.

The labels are incorporated into the nucleic acid probes by any of a number
of methods well known to those of skill in the art. In some instances, a label is
simultaneously incorporated during the amplification procedure in the preparation of the
nucleic acid probes. Thus, for example, polymerase chain reaction (PCR) with labeled




ptocrsorUbcWn— spo ^^^^^^^^ 
i™„™«he™cd.d=cm»l,opuc.loro ^^^^^^^^^^^^^^^ 

. ^iolabeu^ddccudosingphotographicfilmor 
..>„ed«sh..Bnzy-c.*.Ua«.,p.*«^^^ ^^^^^ 

LTD. (OS*. -^r^Z. Hlten-PaCca* P-o AUo, C.if.) « 
n,Z,ma*a^««on.Hopta.on,M^ „ftt,e *ove donees 

»,difioations to these devices (if any) so tlBt tn y 
apparenttoperso.«ddll=ata«»«>«"""* 




■ w»v,«m>ui!hPUlsor«miig systems themselves »« 
'"'^^"""■'"*''ti,Con...Hopkinton,MA;AirTechme., 

co^neKirily av«e -8-. )™ ,^i,i„„ Syst«»s. 

,a..Hes.Mento.OH;Bec,=m» — . 

,„c.,«i*MA.e.c.). ■^-^':;^'^i„.^a«<^».to>resd»g,of 
toemicrop...eorme»br»emae.ec»K W^^^ ^^^^^^^ 

---'°^rTr::itrs;msp....t..ea...<^.s«» 

and customization. Themanma 



various high throughput. 



0 



■ „fn>aAcanalsobeidentifiedbyhyhridizatxonto 

Polymorphic forms of DNA can a 

me examples of v/hich are described by WO 95/119^:. 
nucleicacidarrays.someexamp i„ one variation of the 

C„tedbyreferenceinitsei^^f^ P 

.3 invention .soUdphasear^ys^-Je - 

polymorphicnucieicacids. Typica y> Either the probe, or the target, or 

andatargetnucleicacidishybndized othep^^ ,,^e target is labeled, hybridization is 

both, can be toeleJ. typicaW " _ nybritotion is 

,.„.edhydetecans.o»„dfl.o.^J^'J^^___^__^^ 

„ ^^Metect^ahv..^.^.^^^^^^^^ 
and .hearget are labeled. de.ecn°» J 

-----r ^^,^^^^,^,.„,,....e. 

„„eleicaeidshasbee„descnbed«^«. V ^^^^ 
„ «i:767(199.);She.d».e,a/.a".C»- V Sc. al». K*e.. e, «l. 

PCr/US95/16155 (WO 96/11958). In „ ^rfer c«stom-n»de 

p^hea^ysusingavailabletechni^^.-"'*"""" 






. ^„,manuftc»«ssp»alizmg in array manufect^c. 

.ppUcation. For example. ^l-T'^" ' j„,„„tovesWtamelU.« 
.„,ea^V.^..™--^^f^fJ^,:t,.,.^of*epro.es.rea.i..eds„ 
„™pera»resfora,.of«.epro,>.s. ;^^^ i;-,„„^,3,„,,reo.oae,y«ar 
.hat.hemeltingtemP""™^'"'""'* L«ded «. achieve a particular Tm «here 

<aifferen..e.g*s^raiffere„.P-^«-|^ ^^.^^ 

,ifferen.pr«>=eai.ved.ffe».GC^«^J 

ce„,iaera.ioninprobedes.gn,<»l.erfactors 



eonstruction. 



15 



20 



E. CsElLJJi Tr^,..„ds for identifyingpolymorphisms 

.aeacri..a.v.c«^»-^^^^^_^^,^ 

i„,„,vesi»basedsep«alionsCe.g.,W techniques are descdWm 

.ectrcphor^iscanbeuscdtoana^J^^^^^^ 

delailinU.S.Palent>*»- „,ves are filled with the s*arationroatax. 

Briefl.oaP«='-^^;„,,„,„,,,„^a„.optiona,.,f^^^^^^ 
™„cpar.Uonnta.ri.««..n3^-- V^^^„^^^ 
TheBFLPorSSRsann-lesare 0^ ^^^_^^^^„^.,Ur, 
Becauseafthesmal l».»»«>°f^'-'''»*^^„,^„,„3i^^merefbrethe 

describedhercin. ,^ i,^,«n,ed to microcharmel plates. These 

,,3.eshavechanneUlessthanl2' ^^^^^^_^^^^„^,,.,,^ds,na 

separation matnx. Using 



throughput format. 




In another high throughput format, multiple capillary tubes are placed in a
capillary electrophoresis apparatus. Samples are loaded onto the tubes and
electrophoresis of the samples is run simultaneously. See, for example, Mathies &
Huang, Nature 359:161 (1992). Because the separation matrix is of low viscosity, after
each run, the capillary tubes can be emptied and reused.

The following examples are offered to further illustrate specific aspects of 
the present invention and are not to be interpreted so as to limit the scope of the present 
invention. 

EXAMPLE 1 

Effect of Genetic Matching on Sample Size and Confidence

The invention can be illustrated by the example of studying serum
cholesterol and the effects drugs may have on this biological condition. It has been
established that up to 80% of the variance in serum cholesterol can be attributed to
genetics (see, for example, R. A. King, J. I. Rotter & A. G. Motulsky, The Genetic
Basis of Common Diseases, Oxford University Press, 1992).

The utility of matching patients at genetic loci is that the confidence, sample
size or discriminating power of a given study can be favorably affected by genetic matching.
For example, if a study is required to have 80% power to detect a difference of 20 mg/dl at
the 5% level, with za = 1.645 (the value from the standard normal distribution
which is exceeded in 5% of cases) and zb = 0.842 (the value of the standard normal
distribution that is exceeded in 20% of cases, i.e., its 80th percentile), the minimum
sample size may be calculated as follows:

za <- qnorm(al)   [al = .95 gives 1.645]

zb <- qnorm(be)   [be = .8 gives .842]

If the genetic contribution to the variance of a variable x is 80% and the variance is 1600
(the standard deviation squared for cholesterol) then the sample size is:

\[ n = \frac{2 \times (z_a + z_b)^2 \times \text{variance}}{(\text{difference})^2} \]
or

\[ n = \frac{2 \times (2.49)^2 \times 1600}{(20)^2} = 2 \times (2.49)^2 \times \frac{1600}{400} \approx 50 \]



orn, nf the study are needed to see a difference of 



10 variance is i 



15 



[1600 x 80% = 1280; 1600 - 1280 = 320; 320/400 = .8]



I „f^»ts(50)in«chann.™ihgcneticmatchmg<he 

p„w„oftt.csmdymcr«s«ft»m.8.ogrea _ 
TABLE 1

Matching %    Variance Reduction    New Variance
10            128                   1472
20            256                   1344
30            384                   1216
40            512                   1088
50            640                   960
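
The arithmetic behind the sample-size formula and the variance figures of Table 1 can be checked with a short Python sketch. The function names are hypothetical, and the rule that matching a given share of the genetic contribution removes the same share of the genetic variance (80% of 1600, i.e. 1280) is an assumption inferred from the table entries rather than something spelled out in the text.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(variance, difference, alpha=0.05, power=0.80):
    """n = 2 (za + zb)^2 * variance / difference^2, rounded up to a whole patient."""
    za = NormalDist().inv_cdf(1.0 - alpha)  # about 1.645 for alpha = 0.05
    zb = NormalDist().inv_cdf(power)        # about 0.842 for power = 0.80
    return ceil(2.0 * (za + zb) ** 2 * variance / difference ** 2)


def matched_variance(total_variance, genetic_fraction, matching_fraction):
    """Variance left after removing the matched share of the genetic variance (assumed rule)."""
    return total_variance * (1.0 - genetic_fraction * matching_fraction)


if __name__ == "__main__":
    print("unmatched n per arm:", sample_size_per_arm(1600, 20))
    for pct in (10, 20, 30, 40, 50):
        v = matched_variance(1600, 0.80, pct / 100)
        print(pct, round(1600 - v), round(v), sample_size_per_arm(v, 20))
```

With a variance of 1600 and a difference of 20 mg/dl the function returns the 50 patients per arm worked out above, and `matched_variance` gives the "New Variance" column of Table 1.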
EXAMPLE 2



Use of genetic analysis to reduce the sample size 
necessary in clinical trials 

The following example is illustrative of the method of identifying
underlying genetic factors that influence the response to treatment and the use of this
information in the design of clinical trials.

to fte iitsnce of two dWinot genetic sub-pop»latioas A a»d B, aaaoca«=d 
^.h.io«responaea„dhi*^on«.ot««n,ent.«speoUvc,y,.h.«spo,«eoft«a.ed 

tadividualsftomtheWsub-populaUon. A.ha.mean andvariance. .J. Inthe 

..ond suh-pcpuiation. the mean vadance of «spon.e a« givea by and 4 . 

, ^„U,e,y. Koaa.nn,p.ionian»deahonttheshapeofei,herdistnT,u.ton(...mey^ 
Jhavetobenonnal,. to con.™, individnai, who do not reeeive the treatment (tnstead 
^eivingapla^bo^emeanat^varianoeofreapomeiagivenby CaA) ^ 
r.»*l>fbrpopu,ationsAandB.respeotive,y.menasamplei=U.enfromoneot 
«,es.popu,aUons,U,edia.ributionof.hesamp.emeani3nonna„yaistn-bn.ed. For 

.0 examp.e.ifasampleof«» ^ of treated individnds is drawn ih.n.suVpopu,a..o„ A. the 

samplemeanhas distribution Nor».tAi^,<r5 

If the genetic background of the sub-populations is ignored, both the
treated (case) and untreated (control) populations include a mixture of individuals
sampled from the two distributions. When the probability of an individual being chosen
from genetic sub-population A is p and the probability of an individual being selected
from genetic sub-population B is q = 1 - p, the distribution of the mean response of a
sample selected from the two populations can be described.

For example, when N is the total population size and $\{x_1, x_2, \ldots, x_i, \ldots, x_N\}$
is a set of random variables, each describing an individual in the sample, the expectation
of the mean response in the sample is given by,

\[ E[\bar{x}] = \frac{1}{N} \sum_{i=1}^{N} E[x_i] \qquad (1) \]

where $E[x_i] = \mu_A$ with probability p and $E[x_i] = \mu_B$ with probability q = 1 - p. So
the mean of the distribution of the sample mean is,

\[ E[\bar{x}] = p\mu_A + q\mu_B \qquad (2) \]

The variance of the mean response in such a case depends on both the
variances of each of the two distributions (A and B) and on the difference between the
means of these distributions. This variance can be expressed in terms of the sum of the
variances of the individuals,

\[ V[\bar{x}] = \frac{1}{N^2} \sum_{i=1}^{N} V[x_i] = \frac{V[x_i]}{N} \qquad (3) \]

When a random variable Y is defined such that when Y = 0, $V[x_i \mid Y = 0] = \sigma_A^2$ and
when Y = 1, $V[x_i \mid Y = 1] = \sigma_B^2$, the expectation of this variance is then,

\[ E[V[x_i \mid Y]] = p\sigma_A^2 + q\sigma_B^2 \qquad (4) \]
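
The mixture mean of equation (2) and the variance decomposition that starts with equation (4) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, normal response distributions are assumed purely for the simulation (the text makes no assumption about shape), and the between-group term p q (mu_A - mu_B)^2 is the standard law-of-total-variance result. The parameter values used in the usage note are those of part D of this example.

```python
import random
from statistics import fmean, pvariance


def mixture_moments(p, mu_a, sigma_a, mu_b, sigma_b):
    """Mean and variance of one individual's response in a p : (1 - p) mixture."""
    q = 1.0 - p
    mean = p * mu_a + q * mu_b                      # equation (2)
    within = p * sigma_a ** 2 + q * sigma_b ** 2    # equation (4): E[V[x | Y]]
    between = p * q * (mu_a - mu_b) ** 2            # V[E[x | Y]] (law of total variance)
    return mean, within + between


def simulate(p, mu_a, sigma_a, mu_b, sigma_b, n, seed=0):
    """Draw n individuals from the mixture; normal shapes are an assumption of this sketch."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < p:
            draws.append(rng.gauss(mu_a, sigma_a))  # individual from sub-population A
        else:
            draws.append(rng.gauss(mu_b, sigma_b))  # individual from sub-population B
    return fmean(draws), pvariance(draws)
```

For p = 0.5, mu_A = 5, mu_B = 0 and a standard deviation of 8 in both groups, `mixture_moments` gives a mean of 2.5 and a variance of 70.25; a call such as `simulate(0.5, 5, 8, 0, 8, 50000)` should land close to those values. Dividing the variance by N gives the variance of the sample mean, as in equation (3).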
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■pvaC^tT, E[Xi\Y -yJl-MA 



V[E[Xi\Y]] = PMA-^9ftB 



By Standard theory, n*il -£l»'L^. 1'^^ 



^iNcr^im Thatis.thevariancewhenthesub-populattons 
I.po.an«y.n.n>M.l.. A^.^ ^^^ 

, -ignoredisalwayslarserthanthev^- variance is the weighted su. of the 

intuitively be seen from the shape of equauon 6. Th 

• = (rrl ah plus a term representing the difference 
two population specific variances (cr^ .(Tb) pi 

^ x2 Thus.thevarianceconsistsofboththe 
between thetwopopulationmeans • 

^«ndBisp•9)isnormal(bytheCentralLumt 
individuals from sub-populauons A andB IS p. ^) 

Theorem)andhasmeanandvariance {pMA^<lf^B> AT 



u ,^.hv examining genetic markers. Generally, the more 
^fl^tar. genotype. >hegrea«rm p , „ . „mdom mix 

of the two populations with ratio p. g ofmdiviau 
the two genetic backgrounds have the same frequency, then p = q = 0.5 and the
distribution of the sample mean is characterized as,

\[ \bar{x} \sim \text{Normal}\left( \frac{\mu_A + \mu_B}{2},\; \frac{\sigma_A^2 + \sigma_B^2}{2N} + \frac{(\mu_A - \mu_B)^2}{4N} \right) \qquad (7) \]

In some instances, all individuals from the sub-population are genotyped for k equally
informative markers. This is sometimes the case when markers are chosen at random (i.e.,
if nothing is known about genes involved in responsiveness). Additional markers will
usually provide decreasing information (i.e., though the (k + 1)th marker increases the
probability of correctly assigning an individual to a sub-population, it provides less
information than the kth marker); this does not necessarily have to be the case but often
is the case. For example, if there is a priori knowledge of the genes involved in response,
these are typically examined first. Alternatively, if there is no information about the
underlying genetics of response, then the genetic matching is based on relatedness (i.e.,
the overall degree of genetic similarity in the genome) and hence the first few markers
will be highly informative with diminishing information from each additional marker.

Consider a simple model where the probability of assigning an individual
to the correct genetic sub-population when k markers have been genotyped is given by,

\[ P[\text{correct} \mid k] = \frac{1}{2}\left(1 + \frac{k}{k+1}\right) \qquad (8) \]

As k tends to infinity, the probability of correct assignment asymptotes to 1. This
probability can be used in the equations above to determine the mixture of the sampled
population. For $k \to \infty$, $p \to 1$ in equations 2 and 6, so the mean and variance of the
sample mean are given by $(\mu_A, \sigma_A^2/N)$ for population A and $(\mu_B, \sigma_B^2/N)$ for population
B (using the information to set p = 0 and so select non-responders) as expected.
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. f -.oi trial with geneticinatdaSSi 
t^mS^^i^'^'^^^'^'''^^~-^^,t,^K halt of«hom are 



treatment and hail wn .u«*^,t.^ samole is normally 



given 



the 



-frequent, then the responsem 



the treated sample 1 



populations are eqm- 
distributed withmean and variance givenby. 




(9) 



10 In the control 
and variance, 



again normal with mean 



(10) 



15 



fsimplicity.thetwo samples are 



selected to be of equal size, but this does 
porthesakeofsimpuc^ty.u..^^^"--"^-^ ^^^^^ 
.thavetobeso.men-— 

distributionscanbeused to how h^^^ ^^^^ ^^^^^^^^^^^ ^ 

response between thetreatedand placebo gro p 



. the trial giving 2N as the total number of 
diffirence between to means of the Mses an 
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are identical /i^ = //^ and fiB-f^B^ *e sample size is infinity). The sample size also 
increases as the variances of the sub-populations increase. 

D. Example of the sample size for matched and unmatched populations. 

In one instance, the two genetic sub-populations have the same response
characteristics when no treatment is administered (mean 0 and standard deviation 8 in
both sub-populations) and the mean response to treatment of individuals from group A is
described by $\mu_A = 5$, $\sigma_A = 8$. Response to treatment for individuals from group B is
the same as for the placebo (i.e., they are non-responders) with $\mu_B = 0$, $\sigma_B = 8$.
Further, in this example, a 5% significance level ($\alpha = 0.05$) is used and the sample size
represents the minimum number of individuals needed for 80% power. Table 2 below gives
the number of markers (k), the probability of the selected individual coming from group A
(i.e., being correctly identified as a responder) (p), the variance of the sample mean for
the treated population (V[x]) and the sample size required in each arm of the trial (N).



TABLE 2

k     p      V[x]   N
0     0.50   70     83
1     0.75   69     36
5     0.92   66     24
10    0.95   65     22
∞     1.00   64     20



In this example, the within population variance is fixed at 64 for both 
genetic subpopulations and in both treated and untreated samples. Table 2 shows that, 
when no markers are genotyped, the variance is 70. This increase in the variance is 
entirely due to the difference in the response due to the underlying genotype. When this 
is accounted for (when k → ∞), the variance returns to the expected value of 64. This

Utt,e,,mplc«zedo.no.increas.Un=arlyina.asi„g™anc=.bu.r..berw,«,ftc 
me variance), mere 83 tadivi*...ar=n«ded»e*=ha™of*.«B.. no 

genetic information is available, only 20 individuals are needed if individuals can be
correctly assigned as responders using their polymorphic profile.
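
The p and V[x] columns of Table 2 can be reproduced with the sketch below. It assumes the marker model of equation (8) in the form 1/2 (1 + k/(k+1)) and a per-individual variance equal to the within-group variance of 64 plus the between-group term p(1 - p)(mu_A - mu_B)^2; both forms are inferred from the tabulated values, and the function names are hypothetical. The N column is not recomputed here.

```python
def p_correct(k):
    """Probability of assigning an individual to the correct sub-population with k markers."""
    return 0.5 * (1.0 + k / (k + 1.0))


def treated_variance(p, mu_a=5.0, mu_b=0.0, sigma=8.0):
    """Variance of a treated individual's response when a fraction p are true responders."""
    q = 1.0 - p
    return sigma ** 2 + p * q * (mu_a - mu_b) ** 2


def table_rows(marker_counts=(0, 1, 5, 10)):
    """Rows of (k, p, V[x]) rounded as in Table 2; the k -> infinity limit is appended last."""
    rows = [(k, round(p_correct(k), 2), round(treated_variance(p_correct(k))))
            for k in marker_counts]
    rows.append(("inf", 1.0, round(treated_variance(1.0))))
    return rows


if __name__ == "__main__":
    for row in table_rows():
        print(row)
```

The variance falls from about 70 with no markers to the within-group value of 64 once assignment is perfect, which is the pattern described in the paragraph above.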

EXAMPLE 3

Effects of linkage disequilibrium and marker allele frequencies
on the power of a clinical trial

In this example, there are two genetic sub-populations A and B, and a
single bi-allelic SNP (single nucleotide polymorphism) that is present in both
sub-populations. One of the alleles, labeled g, has frequency $p(g \mid A) = p_A$ in
sub-population A and $p(g \mid B) = p_B$ in sub-population B. In this particular example, the two
sub-populations have equal frequency (i.e., p(A) = p(B) = 0.5). In this situation, the
population-wide frequency of the rare allele (i.e., the allele of the pair that has the lower
frequency in the general population) is then $p(g) = (p_A + p_B)/2$. In this example, it has
been shown that the allele g is associated with increased response to the treatment. The
increased response can be due to the function of the allele itself, but more usually is
due to the allele being in linkage disequilibrium with some (unknown) genetic factor or
mutation. The magnitude of linkage disequilibrium, d, is defined as
$d_A = p(A \& g) - p(A)p(g)$ for sub-population A and $d_B = p(B \& g) - p(B)p(g)$ for
sub-population B. If the factor causing increased response is more common in
sub-population A and the SNP marker is in close proximity, it is expected that $d_A > d_B$. In
this sense, the sub-populations can be regarded as carrying the high-response (A) and
low-response (B) alleles of the unknown gene that influences an individual's response to
treatment.

In some instances, an SNP lies in a region that is known to harbor a gene
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aiey corns from sab-popuUnon A), wnai 
p(^&g) = 0.3.»d = 

and ^2) 

,5 Thatis,theiei»ap<»«>«»««»«'° •„;„„ hstween e and sub- 

, • A r^versdv there is 

popu,a«onA)givc«mei«.ividual»-«fl.e. .Uele. Th.s.8.v 



20 



and similarly, 
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(4) 



„pr«ed (using Bayes'ftcorem) as, 

(5) 



(6) 

■ p[B&gl._£j — 

pis I si =■"^57" p^+p> 

"*'^"TTo^(siv»«^-s„b-p<>pu,.a<.»a« 

disequnibrirnnbrtw^nth'""^' „.„5 ,»«s»ti«gr»dom««>pUngof 

.sd«cribcdi»*.P«'i<»^»»™'"Y' 075 r^pr^entings^plinguaingtoe 
i^^d..,s6«n.fte»opop«.aSo-a^r-»" 

information ftom the g»»ttc n,a*er. ^^^^^^ ^ 
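
The Bayes' theorem step used in this example can be sketched as follows. The allele frequencies 0.6 and 0.2 are illustrative assumptions chosen so that p(A & g) = 0.3 and the posterior is 0.75, matching the figures that survive in the text; the function names, the extension to several independent markers and the one-allele-per-marker genotype coding are likewise assumptions of this sketch.

```python
from math import prod


def posterior_a(p_a, p_b, prior_a=0.5):
    """P(A | g): probability of sub-population A given that allele g is carried."""
    joint_a = prior_a * p_a            # p(A & g)
    joint_b = (1.0 - prior_a) * p_b    # p(B & g)
    return joint_a / (joint_a + joint_b)


def linkage_disequilibrium_a(p_a, p_b, prior_a=0.5):
    """d_A = p(A & g) - p(A) p(g)."""
    p_g = prior_a * p_a + (1.0 - prior_a) * p_b
    return prior_a * p_a - prior_a * p_g


def posterior_a_multi(genotype, freqs_a, freqs_b, prior_a=0.5):
    """P(A | G) for independent bi-allelic markers; genotype[i] is 1 if allele g_i is present."""
    like_a = prod(p if g else 1.0 - p for g, p in zip(genotype, freqs_a))
    like_b = prod(q if g else 1.0 - q for g, q in zip(genotype, freqs_b))
    return prior_a * like_a / (prior_a * like_a + (1.0 - prior_a) * like_b)


if __name__ == "__main__":
    print(posterior_a(0.6, 0.2))                              # close to 0.75
    print(linkage_disequilibrium_a(0.6, 0.2))                 # close to 0.1
    print(posterior_a_multi((1, 1), (0.6, 0.6), (0.2, 0.2)))  # close to 0.9
```

With equal sub-population frequencies the single-marker posterior reduces to p_A / (p_A + p_B), so sampling on the marker raises the chance of drawing from sub-population A from 0.5 to 0.75 under these illustrative frequencies, and each further independent marker sharpens the assignment.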

'^"''"'"""r^J^LUcoaseimepoiy.o.bi^lsa.a*, 
fc. WO populations .rekno«n.Ttasw. „ ..^.Howev^.itinay ofto 

be th= case that previous trials can 

"""■'^'"^.^ethodcxtendsnaturaHytomulUple— crs. Biti^is case. 

25 
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ntsthcgenotypeofanindividuaUtasetofbi-alleUcxnarlcers 
Ge(g......gJ-presentsthegenotyp , . i„3ub-populationB. Then.by 

n Un sub-population A and {q„.-M m suu p f 
frequencies {pp...,P») ^"^s^^P 

Bayes* Theorem, 

tlP> 



(7) 



expressed as, 



(8) 



10 



D\\J\"ifi.''J ■ = -I k 



. .Mi„th=ringlelocus.a«.<.d«len»me.hepro«iUtyof 



allocatmg an 



15 



are en 



probability of «»Mivi<l«alW<mg.ng. 



',l*.c..nofto5„,arkcrsisas*o«ninT=blc3. 
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which the rare allele 



is seen. 



pFobabUity^fbe^^ 



0.247 



10 



' .u^biiitv of correctly assigning an individual to 
Tables showsthattheprobabihty Of c ^ ' ed. In 

ranidlv with the number of markers genoiyF 

p„b*iUty of bekmging to s»b-pop»l.««n A ^ 
„ le>ep.«..of.....-^-^^^ 

^.^o..«.>,>---P2n^I.^o«.on,,U,«»a«v.».^^ 



20 
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EXAMPLE 4 

Effects of the number of markers and their allele frequencies on the power of the
polymorphic profile to discriminate between distinct groups of patients.

In this example, there are two types of markers, those with common alleles
(i.e., the two alleles are at similar frequency) and those with a rare allele. For the first
marker type, one allele has frequency $p_1 = 0.5$ in sub-population A (responders) and
$q_1 = 0.4$ in sub-population B (non-responders). For the second type of marker, one
allele has frequency $p_2 = 0.1$ in sub-population A and $q_2 = 0.08$ in sub-population B. In
both cases, the rare allele has a 20% lower frequency in sub-population B compared to its
frequency in sub-population A. A set of markers, k in number, are genotyped in a sample
of 2000 patients who are known to belong either to sub-population A or sub-population B
(from a previous clinical trial). For each of these individuals there are k observations
$x_1, x_2, \ldots, x_k$ which take the value 0 if the rare allele is present and 1 otherwise. These
individuals can be classified into sub-populations with y = 0 if they come from
sub-population A and y = 1 if they belong to sub-population B. Using such training data, 2000
individuals were generated and assigned to one sub-population or the other using a linear
logistic model (Christensen, Log-Linear Models and Logistic Regression, Springer-Verlag,
New York, 1997) of the form,

Other statistical methods (such as those described in Example 3) can also be used.
This linear logistic model was chosen to illustrate another method of classification.

Table 4 gives the probability of assigning an individual to the correct
sub-population for 2, 5, 10, 20 and 50 markers. Values are given for both types of markers and
for a mixture of the two. In this example, all markers are assumed to be independent of one
another. If this were not the case, other, more powerful, statistical methods can be applied
(for example, methods of classification trees (Breiman et al., Classification and Regression
Trees, CRC Press, 1984)).
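
A simulation of this kind can be sketched in plain Python. The sketch assumes the standard linear logistic form, log(pi / (1 - pi)) = b + sum of w_j x_j, fitted by full-batch gradient descent; the fitting procedure, the function names, the sample size and the random seed are all assumptions made here, so the accuracies it produces are not expected to match Table 4.

```python
import math
import random


def simulate(n, freqs_a, freqs_b, rng):
    """Return (X, y): x_j = 0 if the rare allele is present, 1 otherwise; y = 0 for A, 1 for B."""
    X, y = [], []
    for _ in range(n):
        label = 0 if rng.random() < 0.5 else 1       # equal sub-population frequencies
        freqs = freqs_a if label == 0 else freqs_b
        X.append([0 if rng.random() < f else 1 for f in freqs])
        y.append(label)
    return X, y


def fit_logistic(X, y, steps=200, rate=0.5):
    """Fit weights w and intercept b by gradient descent on the logistic log-loss."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * k, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label  # predicted P(y = 1) minus observed y
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        b -= rate * grad_b / n
        w = [wj - rate * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def accuracy(X, y, w, b):
    """Fraction of individuals assigned to their true sub-population."""
    hits = 0
    for row, label in zip(X, y):
        z = b + sum(wj * xj for wj, xj in zip(w, row))
        hits += int((z > 0.0) == (label == 1))
    return hits / len(y)


if __name__ == "__main__":
    rng = random.Random(1)
    X, y = simulate(400, [0.5] * 20, [0.4] * 20, rng)  # 20 common-allele markers
    w, b = fit_logistic(X, y)
    print("training accuracy:", accuracy(X, y, w, b))
```

Swapping the frequency lists for `[0.1] * k` and `[0.08] * k` gives the rare-allele case, and concatenating the two lists gives the mixed case, so the three rows of Table 4 can be explored with the same code.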



TABLE 4 



0 



Number of markers (fc) 



0.50 



0.50 



0.50 



0.58 



0.50 



0.56 



0.58 



0.62 



0.53 



0.58 



20 



0.55 



0.57 



0.66 



0.56 



0.62 



50 



0.75 



0.57 



0.70 



00 



0.84 



0.62 



0.77 



Common allele 
p, =0.5.9, =0.4 

Rare allele 
p, = 0.1, =0.08 

Equal mix of 
p,=0.5,g,=0.4 

and 

Pi =0.1,9, =0.08 

In these simulations, the markers for which the two alleles have similar
frequencies are more effective in determining which group an individual belongs to than
are markers with very different allele frequencies. An equal mix of the
two classes provides, as expected, intermediate results.

In many cases, data from previous clinical trials is available, but there is no
information about which of the two sub-populations the responders and non-responders are
drawn from. Such a scenario is more amenable to analysis by clustering methods. Data for
2000 individuals was simulated using the same allele frequencies as described above. The
data were analyzed using K-means clustering (with K = 2) to investigate how well the
sub-populations can be defined by the markers alone. Because less information is available,
there is little power in this method when very few markers (k < 10) are used. Results are
shown in Table 5 for 10 or more markers.



10 
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TABLES 




5 assign in simulations, 
information was available a prion for 

,„ oonduc. . r^naly- of d«. to ^ ^ 

subsets, the members ol a sao^^ 
e.d,o0.prtonm».b»mdiff«=n.s»bse.- 



10 



15 
It is understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be 
suggested to persons skilled in the art and are to be included within the spirit and purview of 
this application and scope of the appended claims. All publications, patents, and patent 
applications cited herein are hereby expressly incorporated by reference in their entirety for 
all purposes to the same extent as if each individual publication, patent or patent application 
were specifically and individually indicated to be so incorporated by reference. 
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^»cn. P— »^ TllpopuU^o.^ having b»n oh«».«ized for 
polymorphic profile, ana I" 

) treatment procedure. . 

, naseeondtreatedandoontrolpopnlaiion. 
, cycle otthe selecting and de.enn..«ng^<«" 

, Themelhodofcl.iml.«hereinthe«atedandcont..l 

,TedC.in>ii..i.y.o.««p«'»-'>n''-='"«'=-""**°"^,°^ 

4 profile. . 

4 control population. 

The method of claim 1, wherein the treatment procedure comprises
administering a pharmaceutical agent to the members of the treated population and the
control procedure comprises administering a placebo to the members of the control
population.
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fl,e fiislpharmaoeutical agent to them 

3 ^.„e.hodofoUin>7.whe«in«chof«.efl«.».*s.co„d 

,„tMasents.a— „ofp~.c..ase.ts, 
, THe.e..oao..^....e^— 

population. , 

tWofclaiml wherein the treatme«tprocedurecompnses 



5 



from the first schedule. 



„ ^„e*odofe,ain,.«he.i»*e,r.a»ne..p™ceau«compri3e. 

1 

2 a behavioral therapy. 



1 

2 a diet regime. 



X.e,ne.hodofc.in,U.whe«to.hebe>«vio«.,he«py«mpri,es 

„ neme.Wore>aUnl,»h^.He.es.pop...tio„-,^- 
1 * .^«re comprises administermg an agricultural 

, p«tyotp.an..and«».rea»e.tp™c^-»J 

, agen.«.thcp.u«U.yofptents.the.Bncul<«r.Ia*e,*»^ 
, ,U«e,ani«.cc«e»dagro«th.sa— agent 



13. 

2- an exercise regime 
15. The method of claim 1, wherein the subpopulations are humans,
animals or plants.

16. The method of claim 15, wherein the subpopulations are humans.

17. The method of claim 15, wherein the subpopulations are plants.

18. The method of claim 1, wherein the subpopulations are bacteria.

19. The method of claim 1, wherein the subpopulations of subjects are
selected as having been similarly exposed to an environmental factor.

20. The method of claim 1, wherein the subpopulations of subjects are
selected as having been differentially exposed to at least one environmental factor.

21. The method of claim 1, wherein the subpopulations of subjects are
selected as being from the same ethnic group.

22. The method of claim 1, wherein the subpopulations of subjects are
selected for a common phenotypic trait.

23. The method of claim 1, wherein the subpopulations from the
treatment and control populations each include at least 5 members.

24. The method of claim 23, wherein the subpopulations each include
at least 10 members.

25. The method of claim 24, wherein the subpopulations each include
at least 100 members.

26. The method of claim 1, wherein the polymorphic profile for each of
the subpopulations is a single polymorphic form.

27. The method of claim 1, wherein the polymorphic profile for each of
the subpopulations comprises a plurality of polymorphic forms.



WO 00/33161

PCT/US99/28582



^esinamet*oUcpath«y. 

potential method for treating a disease an
correlated with the disease.

,_„e.ir:::r:— 

,Ke«e.hodofo,ai.n..w.«^*«^'»---"«^'-«'"'"'^°' 
includes at least 10 polymorphic forms.

33 ,,e„ethodotolaimUwh=rein.hepolymorphiepro«cf.each 

1 subpopulationsareatlcastloy-idenUcal. , 

35. The method of claim 34, wherein the polymorphic profiles for the
subpopulations are at least 50% identical.

36. The method of claim 35, wherein the polymorphic profiles for the
subpopulations are at least 75% identical.

37. The method of claim 36, wherein the polymorphic profiles for the
subpopulations are identical.

38. The method of claim 1, wherein the test parameter is a measure of
disease.
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3,. Thcm..hodofdaim38,»h«eind»db«sei.«.o»:. 
elevated serum cholesterol level.

susceptibility to frost damage.

42. A method for conducting a clinical trial, comprising:

!i ta«„sa.rea.=dpopul.tioaofpa.le..sha™sad,se.«»..ha 

.,^j:aeorj...o.o.aa.«..gt.ed^a— 
"""""""^ .,eo.i„gasu^u.atlo,>ofpaUe„t.*.meaohof.he.«a.edand 

: ,....a.el..pop».«on3as..as.— o..ee.ea.o.t.ed«s» 

9 treating the disease. 

43. The method of claim 42, wherein the determining step comprises
determining whether there is a statistically significant difference in a test parameter
between the subpopulations.



44. The method of claim 42, further comprising determining a
polymorphic profile for each patient in the treated and control populations before the
selecting step.



45. The method of claim 42, wherein the control procedure involves
administering a placebo to the members of the control population.

The method of claim 42, wherein the drug is effective in treating a
disease other than the disease for which the clinical trial is being performed.
48. A computer program product for assessing a treatment procedure,

5 heated according to a tr«tm=nt P«««lure»d 

, t«a.cdacco,dU,g,oa<».ro.p,^^^^^^^^^^^^_^^ 

; ^e,reatedandco«.»lpop«U.^^^^^,^_,„^,^,„^ 
: -'=-a^-->^^°^^„,„,„,...pop.>ation^ 

«.r«Pter between the subpopulations; 
Z l.pu.«teada..es.„»ge™«>i™nro..o,d.«g.h.codes. 
49. A system for assessing a treatment procedure, comprising:
    (a) a memory;
    (b) a system bus; and
    (c) a processor operatively disposed to
        (i) provide or receive data comprising
            designations for each member of a treated population
            having been treated according to a treatment procedure and each
            member of a control population treated according to a control
            procedure;
            designations for a polymorphic profile for each member of
            the treated and control populations; and
            designations for a test parameter for each member of the
            treated and control populations;
        (ii) select a subpopulation from each of the treated and control
        populations that have a similar polymorphic profile;
        (iii) determine whether there is a statistically significant
        difference in the test parameter between the subpopulations; and
        (iv) display an output of the result from step (iii).
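Steps (i) through (iii) of the system claim can be sketched as follows. This is a minimal illustration, not the claimed implementation: the record layout, the similarity measure (fraction of typed sites carrying the same form), and the use of a permutation test for significance are all assumptions, and every name in the block is hypothetical.

```python
import random
from dataclasses import dataclass
from statistics import mean


@dataclass
class Member:
    group: str      # "treated" or "control" (step (i): population designation)
    profile: tuple  # polymorphic form at each typed site, e.g. ("A", "G")
    value: float    # test parameter measured for this member


def similarity(p, q):
    """Fraction of sites at which two profiles carry the same polymorphic form."""
    return sum(a == b for a, b in zip(p, q)) / len(p)


def select_subpopulations(members, reference, min_similarity=1.0):
    """Step (ii): keep members whose profile is similar to a reference profile."""
    def pick(group):
        return [
            m.value
            for m in members
            if m.group == group and similarity(m.profile, reference) >= min_similarity
        ]
    return pick("treated"), pick("control")


def permutation_p_value(treated, control, n_perm=2000, seed=0):
    """Step (iii): two-sided permutation test on the difference in means.

    A permutation test is one assumed choice; the text does not prescribe
    a particular significance test.
    """
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_t = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n_t]) - mean(pooled[n_t:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Step (iv) would then amount to printing or otherwise displaying the returned p-value. Setting `min_similarity` below 1.0 admits members whose profiles only partly match the reference, which corresponds to the dependent claims that require the profiles to be at least a given percentage identical.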

50. A method of conducting a clinical trial, comprising
    (a) determining a polymorphic profile for individuals in a population
    having the same disease, wherein the polymorphic profile includes at least one
    polymorphic form at a polymorphic site not known to be associated with the disease;
    (b) selecting a subpopulation of individuals having a similar
    polymorphic profile from the population;
    (c) administering a treatment regime to a treatment group within the
    subpopulation and a control regime to a control group within the subpopulation;
    (d) determining a test parameter in patients in the treatment group and
    the control group expected to vary in response to an effective treatment regime; and
    (e) determining whether the parameter shows a statistically significant
    difference between the treatment group and the control group.
51. A method of conducting a clinical trial, comprising:
    (a) determining a polymorphic profile for individuals in a population
    having the same disease;
    (b) identifying subsets of individuals in the population such that the
    individuals in a subset show greater similarity in polymorphic profile than individuals in
    different subsets;
    (c) allocating the members of each subset to treatment and control
    subpopulations of the populations so that the treated and control subpopulations each
    receive at least one individual from each subset;
    (d) administering a treatment regime to the treatment subpopulation
    and a control regime to a control subpopulation;
    (e) determining a parameter in patients in the treatment subpopulation
    and the control subpopulation expected to vary in response to an effective treatment
    regime; and
    (f) determining whether the parameter shows a statistically significant
    difference between the treatment subpopulation and the control subpopulation.

52. The method of claim 51, wherein the subsets are pairs of individuals.
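Steps (b) and (c) of claim 51, in the paired form of claim 52, can be illustrated with a short sketch. The greedy nearest-neighbor pairing, the Hamming distance on profiles, and the coin-flip allocation within each pair are assumptions chosen for brevity; the claims do not prescribe a particular matching algorithm, and all names below are hypothetical.

```python
import random


def hamming(p, q):
    """Number of typed sites at which two polymorphic profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def greedy_pairs(profiles):
    """Step (b): pair each individual with its most similar unpaired individual.

    profiles maps an individual's identifier to a profile tuple.
    Greedy pairing is a simplification; it is not guaranteed to find the
    pairing with the smallest total distance.
    """
    remaining = sorted(profiles)
    pairs = []
    while len(remaining) >= 2:
        first = remaining.pop(0)
        partner = min(remaining, key=lambda o: hamming(profiles[first], profiles[o]))
        remaining.remove(partner)
        pairs.append((first, partner))
    return pairs  # with an odd count, one individual is left unpaired


def allocate(pairs, seed=0):
    """Step (c): send one member of each pair to treatment, the other to control."""
    rng = random.Random(seed)
    treatment, control = [], []
    for a, b in pairs:
        if rng.random() < 0.5:  # randomize which member of the pair is treated
            a, b = b, a
        treatment.append(a)
        control.append(b)
    return treatment, control
```

Because each pair contributes exactly one individual to each arm, the treatment and control subpopulations end up genetically matched, which is what allows the comparison in steps (e) and (f) to be made on within-pair differences.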



(54) Title: METHODS TO REDUCE VARIANCE IN TREATMENT STUDIES USING GENOTYPING
(57) Abstract

The present invention provides methods, computer programs
and computerized systems useful for evaluating the efficacy of various
types of treatment procedures (e.g., clinical trials) as a function of the
genotype of a subject. By matching treatment and control groups
genetically (100), the methods and systems of the invention reduce the
total variance of the study, thereby allowing trials examining the
efficacy or effect of treatment procedures to be conducted with fewer
subjects, with increased confidence values, and/or with increased
precision or discriminatory power. Certain methods of the invention
involve selecting treated and control subpopulations of subjects from
treated and control populations for similarity in polymorphic profile
(102), wherein the treated and control populations have been treated
with a treatment and control procedure, respectively. A determination
is then made whether there is a statistically significant difference
(104) in a test parameter between the treated and control
subpopulations as an assessment of the test procedure (108).
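
The link between matching and a smaller required sample can be made concrete with a standard worked equation. This is a sketch under simplifying assumptions that are not stated in the abstract: treated and control outcomes $Y_T$ and $Y_C$ share a common variance $\sigma^2$, matched subjects have outcome correlation $\rho$, and $n$ denotes the number of subjects per arm.

```latex
\begin{align*}
\operatorname{Var}(Y_T - Y_C) &= \sigma^2 + \sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1-\rho) \\
\operatorname{Var}\!\left(\bar{Y}_T - \bar{Y}_C\right)_{\text{matched}} &= \frac{2\sigma^2(1-\rho)}{n},
\qquad
\operatorname{Var}\!\left(\bar{Y}_T - \bar{Y}_C\right)_{\text{unmatched}} = \frac{2\sigma^2}{n} \\
n_{\text{matched}} &= (1-\rho)\, n_{\text{unmatched}} \quad \text{for equal variance of the estimated effect}
\end{align*}
```

Under these assumptions, any positive correlation induced by genetic matching lowers the variance of the estimated treatment effect, so the same precision is reached with fewer subjects.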



[Figure: flowchart of the method. Reference numerals visible in the figure: 100, 102.]
Box: Provide identifier, polymorphic profile and test parameter for each member of a treated and control population
Box: Select subpopulations from treatment and control populations for similarity in polymorphic profile
Box: Determine if there is a statistically significant difference in the test parameter between subpopulations
Box: Display result of determining step
Decision: Is there a statistically significant difference in test parameter for the two subpopulations?
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